Background Bulk RNA sequencing is a powerful and cost-effective high-throughput technology that provides comprehensive insights into gene expression profiles. However, one key limitation is its lack of cellular resolution, because bulk RNA-seq measures an average expression across all cell types in a sample. In complex biological systems, particularly in diseases where immunity plays a crucial role, a more granular understanding of the immune microenvironment is essential for identifying novel therapeutic targets, improving patient stratification and guiding personalized treatment strategies.

Methods Cellular composition can be estimated from bulk transcriptomics with conventional deconvolution approaches, but challenges remain for less characterized tissues: a lack of disease-specific immune signatures, background predictions and rigid references. To mitigate some of these limitations, we developed a novel strategy that integrates bulk RNA-seq and immune signatures derived from our newly generated single-cell CITE-seq dataset, using the single-sample Gene Set Enrichment Analysis (ssGSEA) scoring method.

By leveraging single-cell and bulk RNA-seq, we identify robust, disease-specific immune signatures and incorporate them into risk stratification models. Using myelodysplastic syndromes (MDS) as a case study, we show that this approach improves patient prognostication and identifies potential immunological targets.

Results We utilized bulk RNA-seq and clinical data from three independent MDS cohorts encompassing 723 (cohort 1), 324 (cohort 2) and 432 (cohort 3) diagnostic bone marrow (BM) samples. Patients were treatment-naïve at the time of sequencing. Cohorts 1 and 2 were used for primary analyses and cohort 3 for validation.

To estimate the relative immune cell content, we first computed immune signature scores using ssGSEA. These scores were highly correlated with cell-type proportions estimated by flow cytometry in 79 matched samples, improving upon other deconvolution methods.
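
As a rough illustration of the scoring step, the sketch below computes a simplified ssGSEA-style score for a single sample in plain Python. It is not the implementation used in this work: the function name `ssgsea_score`, the toy gene names and the rank-weight exponent `alpha=0.25` are assumptions made for the example, and the normalisation applied by published ssGSEA packages is omitted.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified ssGSEA-style enrichment score for one sample.

    expression: dict mapping gene name -> expression value (hypothetical input)
    gene_set:   iterable of gene names forming one immune signature
    alpha:      exponent applied to the rank weights (assumed value)
    """
    # Order genes from highest to lowest expression within the sample.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    members = set(gene_set) & set(ordered)
    if not members or len(members) == n:
        raise ValueError("signature must overlap the sample without covering it")

    # Signature genes are weighted by rank (n for the top gene, 1 for the last).
    weights = [(n - i) ** alpha if gene in members else 0.0
               for i, gene in enumerate(ordered)]
    total_weight = sum(weights)
    miss_step = 1.0 / (n - len(members))

    hit_cdf = miss_cdf = score = 0.0
    for gene, weight in zip(ordered, weights):
        if gene in members:
            hit_cdf += weight / total_weight
        else:
            miss_cdf += miss_step
        # Sum the running difference over all ranks instead of taking its maximum.
        score += hit_cdf - miss_cdf
    return score


# Toy sample with six made-up genes, listed from most to least expressed.
sample = {"G1": 6.0, "G2": 5.0, "G3": 4.0, "G4": 3.0, "G5": 2.0, "G6": 1.0}
top_signature_score = ssgsea_score(sample, ["G1", "G2"])
bottom_signature_score = ssgsea_score(sample, ["G5", "G6"])
```

A signature whose genes sit at the top of the ranking receives a positive score, and one whose genes sit at the bottom receives a negative score, which is what allows the scores to act as a relative measure of immune cell content.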

We then used LASSO to select the most informative signatures to incorporate in a Cox proportional hazards model. By multiplying the regression coefficients by their respective signature scores and summing the products, we generated a single immune-based risk score for each patient.
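
The risk score described here is a linear predictor, which can be sketched in a few lines of Python. The signature names and coefficient values below are invented for illustration and are not taken from the fitted model; the function name `immune_risk_score` is likewise hypothetical.

```python
def immune_risk_score(signature_scores, coefficients):
    """Sum of coefficient * signature score over the selected signatures."""
    return sum(coef * signature_scores[name]
               for name, coef in coefficients.items())


# Hypothetical coefficients for the signatures retained by LASSO.
# Signatures whose coefficient was shrunk to zero are simply absent.
coefficients = {"exhausted_CD8": 0.5, "NK": -0.25}

# Hypothetical ssGSEA scores for one patient, including an unselected signature.
patient_scores = {"exhausted_CD8": 2.0, "NK": 1.0, "B_cell": 3.0}

risk = immune_risk_score(patient_scores, coefficients)  # 0.5*2.0 + (-0.25)*1.0
```

Because only signature scores enter the sum, the score can be computed for any new sample from its bulk RNA-seq profile alone.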

We assigned risk categories to patients based on their risk scores and evaluated survival outcomes across the different categories. To pinpoint biological differences between these categories, we compared various signature scores, immune checkpoints and immune-related hallmark pathways.
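
The cut-offs used to define the risk categories are not given here. Purely as an assumption, the sketch below splits patients into three groups at the tertiles of the risk score; the patient identifiers, score values and the function name `assign_risk_groups` are all hypothetical.

```python
from statistics import quantiles


def assign_risk_groups(risk_scores):
    """Label each patient low, medium or high using tertile cut-offs (assumed)."""
    # quantiles(..., n=3) returns the two cut points separating the tertiles.
    low_cut, high_cut = quantiles(list(risk_scores.values()), n=3)
    groups = {}
    for patient, score in risk_scores.items():
        if score <= low_cut:
            groups[patient] = "low"
        elif score <= high_cut:
            groups[patient] = "medium"
        else:
            groups[patient] = "high"
    return groups


# Hypothetical immune risk scores for six patients.
scores = {"P1": -1.0, "P2": -0.5, "P3": 0.0, "P4": 0.5, "P5": 1.0, "P6": 1.5}
groups = assign_risk_groups(scores)
```

Other choices, such as cut-offs optimised on survival in a training cohort and then fixed for validation, would fit the same interface.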

Because our risk score is calculated independently of any other clinical data, it can be combined with other annotations to stratify patients on additional levels.

For example, while IPSS-M is a widely used prognostic scoring method for MDS, it does not consider the immune context. To address this gap, we used our immune risk score to further stratify patients within the three lowest IPSS-M categories into three immune risk groups, namely low, medium and high risk (median OS: low = 63 mo, medium = 80 mo, high = 107 mo; p<2.9e-6).

We observed that a subset of immune high-risk patients in the lowest IPSS-M categories had a survival probability similar to that of patients in higher IPSS-M categories, suggesting that the immune risk score refines prognostic classification. Notably, the newly identified “immune” high-risk group exhibited an increase in exhausted CD8 memory T cells (p<1.0e-6) and effector T cells (p<2.3e-6), highlighting the relevance of immune dysfunction to disease progression.

We validated these findings in cohort 3, where we defined similar risk groups and recapitulated enrichment in immune cell populations.

Additionally, the immune characteristics of the identified low and high immune risk groups are in line with an independent study based on flow cytometry data (Riva et al., ASH 2024, Presentation 665).

Lastly, PBMC samples matched with cohort 2 are being prepared as a second validation to determine whether similar markers and risk categories can also be identified in blood.

Conclusion We have developed a computational workflow that integrates high-dimensional immune-related data extracted from bulk and single-cell RNA-seq into risk models, introducing a refined immune risk scoring method. This approach is highly flexible and can be easily combined with clinically established prognostic and classification systems. We have shown here how signatures derived from BM samples can be combined with IPSS-M, enhancing patient stratification and identifying patients more likely to respond to immune therapies.
